Unsupervised Image Clustering using Probabilistic Continuous Models and Information Theoretic Principles
Abstract
This thesis proposes a new method for unsupervised image clustering based on continuous probabilistic models and information-theoretic principles. Image clustering is closely related to content-based image retrieval: it enables efficient retrieval algorithms and a user-friendly interface to the database. The thesis presents a coherent theory of continuous probabilistic image modeling based on mixture-of-Gaussians densities. This modeling is then extended to an image-set created by either a supervised or an unsupervised clustering process. Three ways of obtaining the image-set model are introduced, and the differences between them are discussed. Supervised image-set (category) modeling is used to compare the proposed continuous models with the more traditional discrete image modeling based on histograms. The unsupervised image clustering framework is based on a continuous version of a recently introduced information-theoretic principle, the information bottleneck (IB). The clustering method is based on hierarchical grouping. Using a Gaussian mixture model (GMM), each image in a given archive is first represented as a set of coherent regions in a selected feature space. Images are then grouped so that the mutual information between the clusters and the image content is maximally preserved. The appropriate number of clusters can be determined directly from the IB principle. Several clustering algorithms based on the IB principle are presented and compared. An incremental method for calculating the mutual information between images and their continuous representation, which is a byproduct of the IB principle, is introduced. Mutual information is then used as a quality measure for evaluating both image representations and clustering results. Experimental results demonstrate the performance of the proposed clustering methods on a real image database. The influence of image representation on the clustering results is also discussed.
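The hierarchical grouping described in the abstract can be illustrated with a minimal sketch of greedy agglomerative IB clustering. This is not the thesis's implementation: it assumes each image is summarized by a discrete feature histogram p(y|x) rather than a GMM, it uses the standard agglomerative IB merge cost (the merged prior times the weighted Jensen-Shannon divergence), and all function names are hypothetical.

```python
import numpy as np


def js_divergence(p, q, w_p, w_q):
    """Weighted Jensen-Shannon divergence between p and q, in nats."""
    m = w_p * p + w_q * q

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return w_p * kl(p, m) + w_q * kl(q, m)


def mutual_information(priors, conds):
    """I(C;Y) for cluster priors p(c) and rows p(y|c), in nats."""
    p_y = priors @ conds
    total = 0.0
    for p_c, row in zip(priors, conds):
        mask = row > 0
        total += p_c * np.sum(row[mask] * np.log(row[mask] / p_y[mask]))
    return float(total)


def agglomerative_ib(cond, prior=None):
    """Greedily merge the pair of clusters that loses the least information.

    cond  : array of shape (n_images, n_features), each row p(y|x).
    prior : p(x); assumed uniform and strictly positive if omitted.
    Returns a list of (n_clusters, I(C;Y)) recorded after every merge.
    """
    cond = np.asarray(cond, dtype=float)
    n = cond.shape[0]
    priors = [1.0 / n] * n if prior is None else [float(v) for v in prior]
    conds = [row.copy() for row in cond]
    history = [(n, mutual_information(np.array(priors), np.array(conds)))]

    while len(conds) > 1:
        best = None
        for i in range(len(conds)):
            for j in range(i + 1, len(conds)):
                w = priors[i] + priors[j]
                # Information lost by merging clusters i and j.
                cost = w * js_divergence(
                    conds[i], conds[j], priors[i] / w, priors[j] / w
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        w = priors[i] + priors[j]
        conds[i] = (priors[i] * conds[i] + priors[j] * conds[j]) / w
        priors[i] = w
        del conds[j], priors[j]
        history.append(
            (len(conds), mutual_information(np.array(priors), np.array(conds)))
        )
    return history


# Toy archive: four "images", two per visual category.
toy = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
history = agglomerative_ib(toy)
for n_clusters, info in history:
    print(n_clusters, round(info, 4))
```

In this sketch, the recorded I(C;Y) values never increase as clusters are merged. A sharp drop between two consecutive steps is one plausible cue for choosing the number of clusters, in the spirit of the abstract's remark that this number can be determined from the IB principle.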
Similar resources
Applying the Information Bottleneck Principle to Unsupervised Clustering of Discrete and Continuous Image Representations
In this paper we present a method for unsupervised clustering of image databases. The method is based on a recently introduced information-theoretic principle, the information bottleneck (IB) principle. Image archives are clustered such that the mutual information between the clusters and the image content is maximally preserved. The IB principle is applied to both discrete and continuous image...
Unsupervised Image Clustering Using the Information Bottleneck Method
A new method for unsupervised image category clustering is presented, based on a continuous version of a recently introduced information theoretic principle, the information bottleneck (IB). The clustering method is based on hierarchical grouping: Utilizing a Gaussian mixture model, each image in a given archive is first represented as a set of coherent regions in a selected feature space. Imag...
Probabilistic Models for Generating, Modelling and Matching Image Categories
In this paper we present a probabilistic and continuous framework for supervised image category modelling and matching as well as unsupervised clustering of image space into image categories. A generalized GMM-KL framework is described in which each image or image-set (category) is represented as a Gaussian mixture distribution and images (categories) are compared and matched via a probabilisti...
Extraction and 3D Segmentation of Tumors-Based Unsupervised Clustering Techniques in Medical Images
Introduction: The diagnosis and separation of cancerous tumors in medical images require accuracy, experience, and time, and have always posed a major challenge to radiologists and physicians. Materials and Methods: We received 290 medical images, comprising 120 mammographic images in LJPEG format, scanned in gray-scale with a 50-micron size, and 110 MRI images including T1-weighted, T...
A continuous and probabilistic framework for medical image representation and categorization
This work focuses on a general framework for image representation and image matching that may be appropriate for medical image archives. The proposed methodology comprises a continuous and probabilistic image representation scheme using Gaussian mixture modeling (GMM), along with information-theoretic image matching measures (KL). The GMM-KL framework is used for matching and categorizing ...